Volume filtered Euler-Lagrange framework for particle-laden flow explicit unclosed term computation #1043
base: master
Conversation
2hrs for Frontier CI (MFlowCode#832)
Nitpicks 🔍
CodeAnt AI finished reviewing your PR.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@voronoi/gen_voronoi_3D.py`:
- Around line 28-37: The current loop over points builds tetrahedra by taking
consecutive vertex triples (in the block that constructs tetrahedra then
computes centroids via compute_tetrahedron_centroid and volumes via
compute_tetrahedron_volume), which can produce invalid tetrahedra; replace that
tetrahedralization with a proper Delaunay decomposition of the vertex set (use
scipy.spatial.Delaunay on vertices[i] to get tetrahedra indices and then compute
centroids/volumes from those tetrahedra), or simpler and preferred, use freud's
precomputed values by replacing the per-cell centroid/volume computation with
voro_data.centroid[i] and voro_data.volumes[i] to get correct weighted centroids
and volumes. Ensure you remove the consecutive-triple logic and adapt downstream
code to consume the Delaunay-based tetrahedra or voro_data fields.
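For orientation, here is a minimal sketch of the Delaunay-based route — a hedged example, not the repository's code. It assumes SciPy is available and that `cell_vertices` stands in for `vertices[i]`, the (N, 3) vertex array of one convex Voronoi cell; the helper name is illustrative.

```python
import numpy as np
from scipy.spatial import Delaunay

def cell_centroid_and_volume(cell_vertices):
    # Decompose the cell into tetrahedra and accumulate volume-weighted centroids.
    tets = Delaunay(cell_vertices).simplices       # (T, 4) tetrahedron vertex indices
    corners = cell_vertices[tets]                  # (T, 4, 3) tetrahedron corner coordinates
    centroids = corners.mean(axis=1)               # centroid of each tetrahedron
    edges = corners[:, 1:, :] - corners[:, :1, :]  # (T, 3, 3) edge vectors from corner 0
    volumes = np.abs(np.linalg.det(edges)) / 6.0   # |det|/6 is the tetrahedron volume
    total = volumes.sum()
    centroid = (volumes[:, None] * centroids).sum(axis=0) / total
    return centroid, total
```

As the comment notes, freud's precomputed `voro_data.volumes[i]` would avoid recomputing the volumes entirely if the installed freud version exposes them.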
♻️ Duplicate comments (2)
voronoi/gen_voronoi_3D.py (2)
20-22: Avoid mutating the caller’s input array.
`points = initial_points` aliases the input, so the relaxation mutates `initial_points` and the “initial points” plot becomes incorrect. Copy the array before iterating.
🛠️ Suggested fix
```diff
-points = initial_points
+points = np.array(initial_points, copy=True)
```
61-63: Create the output directory recursively.
`os.mkdir` fails if `../runs` doesn’t exist. Use `os.makedirs(..., exist_ok=True)` instead.
🛠️ Suggested fix
```diff
-if os.path.exists(output_dir) == False:
-    os.mkdir(output_dir)
+os.makedirs(output_dir, exist_ok=True)
```
🧹 Nitpick comments (2)
examples/3D_random_sphere_array/case.py (2)
1-3: Unused import: `math`
The `math` module is imported but never used. All mathematical operations use `numpy` instead.
Suggested fix
```diff
 import json
-import math
 import numpy as np
```
36-38: Consider adding basic validation for the sphere location file.
If `sphere_array_locations.txt` is missing or has fewer than 3 columns, the script will fail with an unclear error when accessing `sphere_loc[i, 0:3]` in the loop below. A brief check could improve usability:
Suggested validation
```diff
 # load initial sphere locations
-sphere_loc = np.loadtxt("sphere_array_locations.txt")
+sphere_loc = np.loadtxt("sphere_array_locations.txt", ndmin=2)
+if sphere_loc.ndim != 2 or sphere_loc.shape[1] < 3:
+    raise ValueError("sphere_array_locations.txt must contain at least 3 columns (x, y, z)")
 N_sphere = len(sphere_loc)
```
CodeAnt AI Incremental review completed.
3 issues found across 19 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/post_process/m_global_parameters.fpp">
<violation number="1" location="src/post_process/m_global_parameters.fpp:866">
P1: volume_filter_size is only computed for model_eqns == 2, but it is used for MPI allocations whenever q_filtered_wrt is true. If q_filtered_wrt is enabled with a different model, this uses an uninitialized size and can trigger invalid allocations.</violation>
</file>
<file name="src/simulation/m_global_parameters.fpp">
<violation number="1" location="src/simulation/m_global_parameters.fpp:1214">
P1: volume_filter_size is only computed for model_eqns == 2, but it is used for allocations whenever q_filtered_wrt is enabled. For other model_eqns values this leaves volume_filter_size uninitialized and can cause invalid allocation sizes at runtime.</violation>
</file>
<file name="src/simulation/m_additional_forcing.fpp">
<violation number="1" location="src/simulation/m_additional_forcing.fpp:168">
P2: Use total density for the energy forcing denominator. Dividing by only the first continuity field breaks multi-fluid cases and can cause divide-by-zero when that phase is absent.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/simulation/m_start_up.fpp (1)
1279-1352: Guard filtered-stat I/O to avoid uninitialized buffers.
`q_filtered_wrt` can be enabled without `volume_filter_momentum_eqn`, yet the save path updates/writes `stat_*` arrays regardless. That can dereference uninitialized/unallocated buffers or write garbage when `t_step < t_step_start_stats`. Gate the filtered I/O on `volume_filter_momentum_eqn` (and the stats start step).
🐛 Proposed fix
```diff
-if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+if (q_filtered_wrt .and. volume_filter_momentum_eqn .and. &
+    t_step >= t_step_start_stats .and. (t_step == 0 .or. t_step == t_step_stop)) then
     $:GPU_UPDATE(host='[filtered_fluid_indicator_function%sf]')
     do i = 1, num_dims
         do j = 1, num_dims
             do k = 1, 4
                 $:GPU_UPDATE(host='[stat_reynolds_stress(i, j)%vf(k)%sf]')
                 $:GPU_UPDATE(host='[stat_eff_visc(i, j)%vf(k)%sf]')
             end do
         end do
     end do
     do i = 1, num_dims
         do j = 1, 4
             $:GPU_UPDATE(host='[stat_int_mom_exch(i)%vf(j)%sf]')
         end do
     end do
     do i = 1, E_idx
         do j = 1, 4
             $:GPU_UPDATE(host='[stat_q_cons_filtered(i)%vf(j)%sf]')
         end do
     end do
     do i = 1, 4
         $:GPU_UPDATE(host='[stat_filtered_pressure(i)%sf]')
     end do
 end if
@@
-else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+else if (q_filtered_wrt .and. volume_filter_momentum_eqn .and. &
+         t_step >= t_step_start_stats .and. (t_step == 0 .or. t_step == t_step_stop)) then
     call s_write_data_files(q_cons_ts(1)%vf, q_T_sf, q_prim_vf, save_count, bc_type, &
                             filtered_fluid_indicator_function=filtered_fluid_indicator_function, &
                             stat_q_cons_filtered=stat_q_cons_filtered, stat_filtered_pressure=stat_filtered_pressure, &
                             stat_reynolds_stress=stat_reynolds_stress, stat_eff_visc=stat_eff_visc, stat_int_mom_exch=stat_int_mom_exch)
```
🤖 Fix all issues with AI agents
In `@src/common/m_mpi_common.fpp`:
- Around line 346-389: The code uses hardcoded index offsets when assigning
MPI_IO_DATA%var (e.g., sys_size + 1, +37, +73, +85, +105) which is fragile and
error-prone; replace those magic numbers by computing named offsets from
sys_size (or define constants) based on the sizes of the preceding blocks
(referencing filtered_fluid_indicator_function%sf, stat_reynolds_stress,
stat_eff_visc, stat_int_mom_exch, stat_q_cons_filtered, stat_filtered_pressure)
so the indexing is explicit and maintainable, and fix the view-creation loop to
start at sys_size + 1 (change do i = sys_size, alt_sys to do i = sys_size + 1,
alt_sys) to avoid overwriting the last base variable's view for
MPI_IO_DATA%view.
In `@src/post_process/m_data_input.f90`:
- Around line 243-301: The long Allocate statements in subroutine
s_allocate_filtered_arrays (e.g., allocations for
filtered_fluid_indicator_function%sf, stat_q_cons_filtered%vf(j)%sf,
stat_reynolds_stress%vf(k)%sf, stat_eff_visc%vf(k)%sf,
stat_int_mom_exch%vf(j)%sf) exceed line-length limits; break each long
allocate(...) line into multiple lines using Fortran continuation (&) after
commas or before the continued token, indent continuations for readability, and
ensure parentheses and array bounds (local_start_idx:end_x,
local_start_idx:end_y, local_start_idx:end_z) remain intact on the continued
lines so the allocate calls compile cleanly and satisfy Fortitude.
In `@src/simulation/m_volume_filtering.fpp`:
- Around line 214-216: The abort message incorrectly references "p" while the
check uses mod(Ny, num_procs); update the s_mpi_abort call in this conditional
(the call to s_mpi_abort inside the block that checks mod(Ny, num_procs) /= 0)
so the message refers to Ny (or "y-dimension" / "number of y points") being
divisible by num_procs rather than "p", e.g. change the error string to mention
Ny/rows or y-dimension divisibility by num_procs.
In `@toolchain/mfc/case_validator.py`:
- Around line 1104-1192: Strip all trailing whitespace characters from the
affected block (inside the methods check_periodic_forcing and
check_volume_filtering) so no lines end with extra spaces; edit the lines in
those functions' bodies and docstrings to remove trailing spaces and re-run lint
to confirm the Pylint trailing-whitespace warnings are resolved.
♻️ Duplicate comments (9)
src/simulation/m_data_output.fpp (3)
1018-1054: Avoid writing the beta slot twice in the filtered MPI loop.
When `q_filtered_wrt` is true and `beta` is present, the loop writes `sys_size+1` and then the “Correction” block writes it again, risking overlap/corruption.
🐛 Suggested fix
```diff
 else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
     do i = 1, alt_sys
+        if (present(beta) .and. i == sys_size + 1) cycle
         var_MOK = int(i, MPI_OFFSET_KIND)
         disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)
         call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_DATA%view(i), &
                                'native', mpi_info_int, ierr)
         call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf, data_size*mpi_io_type, &
                                 mpi_io_p, status, ierr)
     end do
```
117-123: Serial path drops filtered outputs.
When `parallel_io` is false, the filtered/statistics arguments are ignored, so `q_filtered_wrt` cannot produce outputs in serial runs. Please extend the serial path or explicitly reject this configuration.
💡 Suggested fix (also update s_write_serial_data_files signature)
```diff
-if (.not. parallel_io) then
-    call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
-else
+if (.not. parallel_io) then
+    if (q_filtered_wrt .and. present(filtered_fluid_indicator_function)) then
+        call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta, &
+                                       filtered_fluid_indicator_function, &
+                                       stat_q_cons_filtered, stat_filtered_pressure, &
+                                       stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+    else
+        call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
+    end if
+else
     call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
                                      filtered_fluid_indicator_function, &
                                      stat_q_cons_filtered, stat_filtered_pressure, &
                                      stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
 end if
```
957-963: Filtered MPI buffers can remain uninitialized.
`s_initialize_mpi_data_filtered` is only called under `ib`, so `q_filtered_wrt` with `ib = .false.` (or file-per-process) leaves filtered views/buffers uninitialized. Move the call outside the `ib` branch or assert unsupported combinations.
src/simulation/m_volume_filtering.fpp (3)
585-594: Finite-difference stencil can read out of bounds.
The loop runs `i=0..m`, `j=0..n`, `k=0..p` while indexing `i+l` with `l=-4..4`, which can access beyond bounds near edges. Restrict loop bounds or use buffered indices.
🐛 Suggested fix (bounds-safe loop)
```diff
-$:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
-do i = 0, m
-    do j = 0, n
-        do k = 0, p
+$:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
+do i = 4, m-4
+    do j = 4, n-4
+        do k = 4, p-4
             $:GPU_LOOP(parallelism='[seq]')
             do l = -4, 4
                 grad_fluid_indicator(1)%sf(i, j, k) = grad_fluid_indicator(1)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i + l, j, k)/dx(i)
                 grad_fluid_indicator(2)%sf(i, j, k) = grad_fluid_indicator(2)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j + l, k)/dy(j)
                 grad_fluid_indicator(3)%sf(i, j, k) = grad_fluid_indicator(3)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j, k + l)/dz(k)
             end do
         end do
     end do
 end do
```
785-786: Guard against zero filtered density.
`q_cons_filtered(1)%sf` can be zero inside particles, causing division-by-zero/NaNs in Reynolds stress.
🐛 Suggested fix
```diff
+real(wp), parameter :: eps_rho = 1.0e-12_wp
 ...
-reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
-    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
+reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
+    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) / &
+       max(q_cons_filtered(1)%sf(i, j, k), eps_rho))
```
1553-1560: Filter back-transform divides by near-zero fluid indicator.
Several assignments divide by `filtered_fluid_indicator_function%sf`, which can be tiny near/inside particles. Add an epsilon floor (or conditional zeroing) for stability.
🐛 Suggested fix (apply to all similar divisions)
```diff
+real(wp), parameter :: eps_phi = 1.0e-12_wp
+real(wp) :: denom
 ...
-q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/filtered_fluid_indicator_function%sf(i, j, k)
+denom = max(filtered_fluid_indicator_function%sf(i, j, k), eps_phi)
+q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/denom
```
Also applies to: 1602-1607, 1652-1656, 1704-1708
src/simulation/p_main.fpp (1)
59-64: Ensure volume-filtering resources are finalized.
Initialization is added here, but the matching finalization calls are not visible. Please confirm these are invoked in `s_finalize_modules()` when `volume_filter_momentum_eqn` is enabled.
```bash
#!/bin/bash
# Verify volume-filtering finalizers are called in shutdown paths.
rg -n "s_finalize_.*filter|s_finalize_fftw_explicit_filter_module" src/simulation
rg -n "volume_filter_momentum_eqn" src/simulation/m_start_up.fpp src/simulation/p_main.fpp
```
src/common/m_ib_patches.fpp (1)
617-633: Avoid shared module state inside the GPU loop (thread-safety).
`s_convert_cylindrical_to_cartesian_coord` writes module-level `cart_y`/`cart_z` inside a parallel region. This is the same race previously flagged; please make the conversion thread-local (e.g., using `f_convert_cyl_to_cart` and local temps) instead of module variables.
src/common/m_mpi_common.fpp (1)
1604-1613: Slab decomposition validation relies on generic error message.
The validation for slab decomposition divisibility is present via the `ierr` flag check at lines 1731-1734, but the error message is generic ("Unsupported combination of values..."). Consider adding a more specific error message for the slab decomposition case to help users diagnose the issue. This is a duplicate of a past review suggestion.
🧹 Nitpick comments (6)
src/simulation/m_data_output.fpp (2)
85-115: Reduce argument count in `s_write_data_files`.
This signature now exceeds the ≤6-argument guideline; consider a derived-type params struct for the filtered/statistics payload. As per coding guidelines, ...
801-814: Reduce argument count in `s_write_parallel_data_files`.
This signature exceeds the ≤6-argument guideline; bundle the optional filtered/statistics fields into a derived-type params struct. As per coding guidelines, ...
src/simulation/m_volume_filtering.fpp (1)
721-729: Trim parameter lists for filtering routines.
`s_setup_terms_filtering` and `s_filter_batch` exceed the ≤6-argument guideline; wrap these inputs into a derived-type params object for maintainability. As per coding guidelines, ...
Also applies to: 1037-1046
src/post_process/m_data_input.f90 (1)
664-669: Reduce argument count via a params struct.
`s_read_parallel_filtered_data` takes 8 arguments, exceeding the project guideline (≤6). Consider a small derived type (e.g., `type(mpi_io_params)`) to pass the MPI offsets as a bundle.
As per coding guidelines, keep routine arguments ≤ 6 by grouping related parameters.
toolchain/mfc/run/case_dicts.py (1)
118-143: Large parameter dictionary expansion for IB patches.
Expanding the `ib_id` range from 10 to 1000 generates thousands of dictionary entries. While this aligns with the PR objective to support up to 1000 IB patches, consider the impact on:
- Schema validation performance (fastjsonschema compilation)
- Memory footprint of the parameter dictionaries
This may be acceptable for the use case, but worth noting for future optimization if startup time becomes an issue.
src/simulation/m_global_parameters.fpp (1)
1048-1050: Consider documenting the volume_filter_size formula.
The formula `sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1)` accounts for multiple filtered quantities with 4 statistical moments each, but the calculation is not self-documenting. Consider adding a comment explaining the breakdown:
- `sys_size`: base conservative variables
- `+1`: filtered fluid indicator function
- `4 * (...)`: 4 statistical moments for each filtered quantity
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/simulation/m_data_output.fpp (1)
85-116: Reduce argument count via a params struct.
This subroutine now exceeds the ≤6 argument guideline. Consider bundling the filtered/stat fields into a derived-type params struct (or module-level context) to keep the API stable and easier to evolve. As per coding guidelines, routine arguments should be ≤6.
🤖 Fix all issues with AI agents
In `@src/post_process/m_data_input.f90`:
- Around line 251-299: Several allocate(...) statements in
s_allocate_filtered_arrays (e.g. filtered_fluid_indicator_function%sf(...),
stat_q_cons_filtered(i)%vf(j)%sf(...), stat_reynolds_stress(i,j)%vf(k)%sf(...),
stat_eff_visc(i,j)%vf(k)%sf(...), stat_int_mom_exch(i)%vf(j)%sf(...)) exceed 100
chars and must be wrapped using Fortran continuation; split the long extent list
across lines with an ampersand at the end of the previous line and a leading
ampersand on the continued line so each allocate call stays under 100 chars,
preserving the full allocate(...) expression and indices exactly and keeping the
existing loop structure and indices intact.
In `@src/simulation/m_compute_statistics.fpp`:
- Around line 161-169: The subroutine
s_compute_statistics_momentum_unclosed_terms currently exceeds the ≤6-argument
guideline; create a derived type (e.g., type :: momentum_unclosed_params) that
bundles the unclosed-term fields (reynolds_stress, eff_visc, int_mom_exch,
q_cons_filtered, filtered_pressure) and replace those five arguments in
s_compute_statistics_momentum_unclosed_terms with a single intent(in) parameter
of that new type, keeping only the remaining scalar arguments (t_step,
t_step_start_stats) as separate arguments; update all call sites to construct
and pass the new params struct and adjust any internal references to use the
struct members.
In `@src/simulation/m_data_output.fpp`:
- Around line 801-814: The subroutine s_write_parallel_data_files currently
exceeds the ≤6-argument guideline; refactor by introducing a single derived type
(e.g. a params/holder type) to carry the optional filtered/stat fields
(stat_q_cons_filtered, stat_filtered_pressure, stat_reynolds_stress,
stat_eff_visc, stat_int_mom_exch and optionally
filtered_fluid_indicator_function and beta) and replace those separate optional
arguments with one optional instance of that derived type in the
s_write_parallel_data_files signature; update the type definition, adjust the
intent and optional attributes accordingly, and update all call sites to
construct/pass the new params object so the routine arguments are reduced to ≤6
while preserving intent(inout) semantics for the contained fields.
In `@src/simulation/m_global_parameters.fpp`:
- Around line 1048-1050: The computation of volume_filter_size is done too early
(when q_filtered_wrt is true) before additional contributions are added to
sys_size (surface tension, damage, chemistry), which can under-size MPI buffers;
move or recompute the volume_filter_size assignment so it runs after sys_size is
fully finalized (i.e., right before the MPI_IO_DATA allocation), keeping the
same conditional on q_filtered_wrt and using the final sys_size value; update
any duplicate early computations (also present around the other occurrence) to
the same post-finalization location to ensure buffers are sized correctly.
In `@src/simulation/m_start_up.fpp`:
- Around line 1152-1157: Before passing 1._wp/fluid_pp(1)%Re(1) into
s_volume_filter_momentum_eqn, validate that fluid_pp(1)%Re(1) is set and
strictly positive; if it is missing or <= 0, call s_mpi_abort with a clear error
message and stop execution. Update the block guarded by
volume_filter_momentum_eqn (the call to s_volume_filter_momentum_eqn) to perform
this check on fluid_pp(1)%Re(1) and only compute the reciprocal
1._wp/fluid_pp(1)%Re(1) when the value is valid; reference the symbols
fluid_pp(1)%Re(1), s_volume_filter_momentum_eqn and s_mpi_abort when making the
change.
In `@src/simulation/m_volume_filtering.fpp`:
- Around line 214-216: The abort message is misleading: the check compares Ny
against num_procs but the s_mpi_abort text refers to "p"; update the error text
to reference Ny (or "Ny" and "number of ranks") so it matches the condition.
Locate the conditional that uses Ny and num_procs in the routine (the if
statement and the call to s_mpi_abort) and change the message passed to
s_mpi_abort to something like "Volume filtering requires Ny to be divisible by
the number of ranks" (or similar phrasing referencing Ny) so logs accurately
reflect the checked dimension.
In `@toolchain/mfc/case_validator.py`:
- Around line 1104-1192: Trim all trailing whitespace characters from the newly
added validator methods check_periodic_forcing and check_volume_filtering (and
any adjacent lines in that diff); locate occurrences inside those functions
(including lines with calls to self.get(...), self.prohibit(...), and the
docstrings) and remove trailing spaces at line ends so the linter warning is
resolved.
♻️ Duplicate comments (18)
src/simulation/p_main.fpp (1)
25-64: Add finalization for volume-filtering resources.
The new initializers are invoked under `volume_filter_momentum_eqn`, but there’s no corresponding cleanup in the finalize path. Please add finalizers (or confirm `s_finalize_modules` already handles them) to avoid leaking large allocations.
src/common/m_mpi_common.fpp (1)
341-389: Fix alt_sys so filtered MPI views cover the full variable range.
`alt_sys` is set to `volume_filter_size`, so the view-creation loop can miss appended variables. It should be `sys_size + volume_filter_size` and the loop should start at `sys_size + 1`.
🐛 Proposed fix
```diff
-alt_sys = volume_filter_size
+alt_sys = sys_size + volume_filter_size
 ...
-do i = sys_size, alt_sys
+do i = sys_size + 1, alt_sys
```
src/post_process/m_data_input.f90 (1)
679-709: Fix alt_sys so MPI filtered reads include base + filtered variables.
`alt_sys` is set to `volume_filter_size`, so `NVARS_MOK` and the read loop undercount when base variables are present. It should be `sys_size + volume_filter_size`.
🐛 Proposed fix
```diff
-alt_sys = volume_filter_size ! filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered, pressure
+alt_sys = sys_size + volume_filter_size ! filtered indicator + stats appended to base vars
```
src/simulation/m_ibm.fpp (3)
620-630: Fix off-by-one in image-point search bounds.
`index+1` is accessed, so allowing `index==m+buff_size` can read past the coordinate arrays.
🐛 Proposed fix
```diff
-bound = m + buff_size
+bound = m + buff_size - 1
 ...
-bound = n + buff_size
+bound = n + buff_size - 1
 ...
-bound = p + buff_size
+bound = p + buff_size - 1
```
648-651: Use `s_mpi_abort` instead of `error stop`.
`error stop` can hang MPI ranks; switch to the MPI-aware abort. As per coding guidelines, use `s_mpi_abort` for error termination.
🐛 Suggested fix
```diff
-print *, "Increase buff_size further in m_helper_basic (currently set to a minimum of 10)"
-error stop "Increase buff_size"
+call s_mpi_abort("Increase buff_size further in m_helper_basic (currently set to a minimum of 10)")
```
1316-1320: Deallocate levelset fields even when `store_levelset=F`.
They are allocated unconditionally in initialization, so conditional deallocation leaks memory. As per coding guidelines, use the deallocation macro.
🐛 Proposed fix
```diff
 @:DEALLOCATE(ib_markers%sf)
-if (store_levelset) then
-    @:DEALLOCATE(levelset%sf)
-    @:DEALLOCATE(levelset_norm%sf)
-end if
+@:DEALLOCATE(levelset%sf)
+@:DEALLOCATE(levelset_norm%sf)
```
src/post_process/m_global_parameters.fpp (2)
123-124: Ensure `volume_filter_size` is initialized for all `q_filtered_wrt` cases.
It’s computed only under `model_eqns == 2`, but used for allocation whenever `q_filtered_wrt` is true; if `q_filtered_wrt` is enabled for another model, the size is undefined. Consider computing it after `sys_size` is finalized for all models, or validate that `q_filtered_wrt` is only used with model_eqns=2.
♻️ Suggested placement (after sys_size is finalized)
```diff
-if (q_filtered_wrt) then
-    volume_filter_size = sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1)
-end if
 ...
+if (q_filtered_wrt) then
+    volume_filter_size = sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1)
+end if
```
Also applies to: 698-700, 865-868
1058-1062: Deallocate extra MPI_IO_DATA entries for filtered outputs.
Nulling pointers alone leaks the allocated `sf` buffers. As per coding guidelines, deallocate with the macro when associated.
🐛 Proposed fix
```diff
 if (q_filtered_wrt) then
     do i = sys_size + 1, volume_filter_size
-        MPI_IO_DATA%var(i)%sf => null()
+        if (associated(MPI_IO_DATA%var(i)%sf)) then
+            @:DEALLOCATE(MPI_IO_DATA%var(i)%sf)
+        end if
     end do
 end if
```
src/simulation/m_data_output.fpp (3)
117-123: Serial path drops filtered outputs when `parallel_io` is false.
`q_filtered_wrt` data never reaches `s_write_serial_data_files`, so filtered outputs are silently skipped in serial runs. This matches a prior finding.
🔧 Suggested fix
```diff
-if (.not. parallel_io) then
-    call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
+if (.not. parallel_io) then
+    if (q_filtered_wrt .and. present(filtered_fluid_indicator_function)) then
+        call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta, &
+                                       filtered_fluid_indicator_function, stat_q_cons_filtered, stat_filtered_pressure, &
+                                       stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+    else
+        call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
+    end if
 else
     call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
                                      filtered_fluid_indicator_function, &
                                      stat_q_cons_filtered, stat_filtered_pressure, &
                                      stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
 end if
```
957-963: Filtered MPI buffers are still only initialized inside the `ib` branch.
When `q_filtered_wrt` is true but `ib` is false (or when `file_per_process` is true), the filtered MPI buffers/views remain uninitialized. This mirrors a prior finding.
1018-1054: Potential duplicate write of `beta` slot when `q_filtered_wrt` is enabled.
The `q_filtered_wrt` loop writes indices `1..alt_sys`, which includes `sys_size+1` when `present(beta)` sets `alt_sys = sys_size + 1`, and then `beta` is written again in the correction block. This is a previously reported overlap risk.
src/simulation/m_additional_forcing.fpp (2)
111-115: Momentum-energy term still assumes 3 components.
`momxb`, `momxb+1`, `momxb+2` hard-code a 3D layout and break 1D/2D or nonstandard layouts. This matches a prior review note.
166-168: Energy forcing denominator is single-fluid and unguarded.
Division by `q_cons_vf(contxb)` can be zero and ignores multi-fluid mixture density, consistent with earlier review findings.
src/simulation/m_global_parameters.fpp (1)
1048-1050: `volume_filter_size` only set in model_eqns==2.
If `q_filtered_wrt` is enabled under other equation sets, `volume_filter_size` can be uninitialized and lead to invalid allocations. This repeats a prior review finding.
src/simulation/m_compute_statistics.fpp (1)
239-246: Moments divide by zero when `ns <= 3` or variance is zero.
The formulas can divide by `ns-1/2/3` and by `Msn(2)` when the variance is zero. This matches a prior review finding.
src/simulation/m_volume_filtering.fpp (3)
585-594: Stencil access can go out of bounds at domain edges.
`i+l`, `j+l`, `k+l` will underflow/overflow near boundaries unless halos are ≥4. Restrict loop ranges or guard indices.
785-786: Guard division by filtered density.
`q_cons_filtered(1)%sf(i,j,k)` can be ~0 in solid regions, causing NaNs. Add a small floor.
🐛 Suggested pattern
```diff
-reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
-    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
+reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
+    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) &
+       / max(q_cons_filtered(1)%sf(i, j, k), eps_rho))
```
1553-1557: Protect divisions by `filtered_fluid_indicator_function`.
The denominator can be 0 in solid cells, leading to NaNs for `q_cons_filtered`, `filtered_pressure`, `reynolds_stress`, and `eff_visc`. Add an epsilon or conditional guard.
🐛 Suggested pattern (apply to all four sites)
```diff
-q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/filtered_fluid_indicator_function%sf(i, j, k)
+q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm / &
+                                 max(filtered_fluid_indicator_function%sf(i, j, k), eps_phi)
```
Also applies to: 1602-1606, 1652-1656, 1704-1708
🧹 Nitpick comments (4)
toolchain/mfc/case_validator.py (1)
1104-1155: Reduce local variable count in `check_periodic_forcing` to satisfy pylint.
A small table-driven validation can drop several locals while keeping the same error messages.
♻️ Possible refactor sketch
```diff
-        u_inf_ref = self.get('u_inf_ref')
-        rho_inf_ref = self.get('rho_inf_ref')
-        P_inf_ref = self.get('P_inf_ref')
-        forcing_window = self.get('forcing_window')
-        forcing_dt = self.get('forcing_dt')
+        required = {
+            'u_inf_ref': "u_inf_ref (desired bulk velocity) must be specified for periodic_forcing",
+            'rho_inf_ref': "rho_inf_ref (desired bulk density) must be specified for periodic_forcing",
+            'P_inf_ref': "P_inf_ref (desired pressure based on bulk internal energy) must be specified for periodic_forcing",
+            'forcing_window': "forcing_window must be specified for periodic_forcing",
+            'forcing_dt': "forcing_dt must be specified for periodic_forcing",
+        }
+        for key, msg in required.items():
+            self.prohibit(self.get(key) is None, msg)
```
src/simulation/m_ibm.fpp (1)
154-228: Add explicit private lists to periodic-buffer GPU loops.
If the macro doesn’t implicitly privatize `j/k/l`, these loops can race. Please verify and add `private='[j,k,l]'` (apply to all six loops). As per coding guidelines, declare loop-local variables in GPU parallel macros.
♻️ Example (apply to all six loops)
```diff
-$:GPU_PARALLEL_LOOP(collapse=3)
+$:GPU_PARALLEL_LOOP(collapse=3, private='[j,k,l]')
```
721-729: Consider grouping arguments to keep subroutines ≤6 params.
Both `s_setup_terms_filtering` and `s_filter_batch` exceed 6 arguments. A derived-type params struct would keep signatures manageable. As per coding guidelines, consider consolidating related fields.
Also applies to: 1037-1045
92-95: Verify mixed‑precision slab buffers are intentional.
Slab buffers and MPI traffic use `complex(c_float_complex)`/`MPI_COMPLEX` while FFTs operate on double-complex. This introduces implicit casts and potential precision loss. If double precision is required end-to-end, align buffer kinds and MPI datatypes.
🔧 Possible alignment (if double precision is desired)
```diff
-complex(c_float_complex), allocatable :: data_cmplx_slabz(:, :, :), data_cmplx_slaby(:, :, :)
+complex(c_double_complex), allocatable :: data_cmplx_slabz(:, :, :), data_cmplx_slaby(:, :, :)
-call MPI_Alltoall(sendbuf_sf, NxC*Nyloc*Nzloc, MPI_COMPLEX, &
-                  recvbuf_sf, NxC*Nyloc*Nzloc, MPI_COMPLEX, MPI_COMM_WORLD, ierr)
+call MPI_Alltoall(sendbuf_sf, NxC*Nyloc*Nzloc, MPI_DOUBLE_COMPLEX, &
+                  recvbuf_sf, NxC*Nyloc*Nzloc, MPI_DOUBLE_COMPLEX, MPI_COMM_WORLD, ierr)
```
Also applies to: 895-897, 933-935, 973-975, 1015-1017
```fortran
subroutine s_compute_statistics_momentum_unclosed_terms(t_step, t_step_start_stats, reynolds_stress, eff_visc, int_mom_exch, q_cons_filtered, filtered_pressure)
    type(scalar_field), dimension(num_dims, num_dims), intent(in) :: reynolds_stress
    type(scalar_field), dimension(num_dims, num_dims), intent(in) :: eff_visc
    type(scalar_field), dimension(num_dims), intent(in) :: int_mom_exch
    type(scalar_field), dimension(sys_size), intent(in) :: q_cons_filtered
    type(scalar_field), intent(in) :: filtered_pressure
    integer, intent(in) :: t_step
    integer, intent(in) :: t_step_start_stats
```
🛠️ Refactor suggestion | 🟠 Major
Reduce argument count via a params struct.
This routine now exceeds the ≤6 argument guideline; consider packing the unclosed-term fields into a derived-type params struct. As per coding guidelines, routine arguments should be ≤6.
🤖 Prompt for AI Agents
In `@src/simulation/m_compute_statistics.fpp` around lines 161 - 169, The
subroutine s_compute_statistics_momentum_unclosed_terms currently exceeds the
≤6-argument guideline; create a derived type (e.g., type ::
momentum_unclosed_params) that bundles the unclosed-term fields
(reynolds_stress, eff_visc, int_mom_exch, q_cons_filtered, filtered_pressure)
and replace those five arguments in s_compute_statistics_momentum_unclosed_terms
with a single intent(in) parameter of that new type, keeping only the remaining
scalar arguments (t_step, t_step_start_stats) as separate arguments; update all
call sites to construct and pass the new params struct and adjust any internal
references to use the struct members.
CodeAnt AI Incremental review completed.
1 issue found across 16 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/simulation/m_ibm.fpp">
<violation number="1" location="src/simulation/m_ibm.fpp:70">
P1: Conditional allocation of levelset/levelset_norm leaves later unconditional uses on unallocated arrays when store_levelset is false, which will crash or corrupt memory in 3D runs.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```fortran
                         -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs))
@:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
                            -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs, 1:3))
if (store_levelset) then
```
P1: Conditional allocation of levelset/levelset_norm leaves later unconditional uses on unallocated arrays when store_levelset is false, which will crash or corrupt memory in 3D runs.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/simulation/m_ibm.fpp, line 70:
<comment>Conditional allocation of levelset/levelset_norm leaves later unconditional uses on unallocated arrays when store_levelset is false, which will crash or corrupt memory in 3D runs.</comment>
<file context>
@@ -67,10 +67,12 @@ contains
- -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs))
- @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
- -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs, 1:3))
+ if (store_levelset) then
+ @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
+ -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs))
</file context>
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/simulation/m_ibm.fpp (1)
76-83: 2D case allocates levelset unconditionally, inconsistent with 3D and deallocation.
When `p == 0` (2D), lines 79-82 allocate `levelset%sf` and `levelset_norm%sf` unconditionally, but lines 1321-1324 only deallocate them when `store_levelset` is true. This creates a memory leak in 2D when `store_levelset = .false.`.
🐛 Suggested fix
```diff
 else
     @:ALLOCATE(ib_markers%sf(-buff_size:m+buff_size, &
                              -buff_size:n+buff_size, 0:0))
-    @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
-                           -buff_size:n+buff_size, 0:0, 1:num_ibs))
-    @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
-                                -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+    if (store_levelset) then
+        @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
+                               -buff_size:n+buff_size, 0:0, 1:num_ibs))
+        @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
+                                    -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+    end if
 end if
```
🤖 Fix all issues with AI agents
In `@examples/3D_random_sphere_array/case.py`:
- Around line 36-38: The code assumes sphere_loc is 2-D but np.loadtxt can
return a 1-D array for single-row files; change the loading to ensure a 2-D
array and compute count from rows by wrapping the result with np.atleast_2d
(e.g., sphere_loc = np.atleast_2d(np.loadtxt(...))) and replace len(sphere_loc)
with sphere_loc.shape[0] so references like sphere_loc[i, 0] work safely.
- Around line 164-166: case_dict may contain NumPy scalar values from
computations (e.g., np.sqrt or array indexing) which json.dumps cannot
serialize; update the serialization to convert NumPy types to native Python
types by passing a default handler or by casting values before dumping. For
example, after case_dict.update(ib_dict) call json.dumps(case_dict,
default=lambda o: o.item() if hasattr(o, "item") else o) or pre-walk
case_dict/ib_dict to replace NumPy scalars with float/int via .item(); reference
the variables case_dict, ib_dict and the json.dumps call to locate where to
apply this change.
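For the first suggestion above (lines 36-38), a minimal sketch of that guard using the names from `case.py` — a hedged example; the exact error message is illustrative:

```python
import numpy as np

# Force a 2-D array even when the file holds a single sphere (one row).
sphere_loc = np.atleast_2d(np.loadtxt("sphere_array_locations.txt"))
if sphere_loc.shape[1] < 3:
    raise ValueError("sphere_array_locations.txt must provide x, y, z columns")

N_sphere = sphere_loc.shape[0]  # number of rows, not total elements
centers = [tuple(sphere_loc[i, 0:3]) for i in range(N_sphere)]  # 2-D indexing is now safe
```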
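For the second suggestion (lines 164-166), a minimal sketch of the `default=` handler approach — hedged; the converter name and sample keys are illustrative, and the review's inline lambda behaves the same way:

```python
import json
import numpy as np

def to_builtin(obj):
    # json.dumps calls this only for objects it cannot serialize natively.
    if isinstance(obj, np.generic):   # NumPy scalars such as np.int64, np.float32
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

case_dict = {"patch_ib(1)%radius": np.float64(0.05), "num_ibs": np.int64(8)}  # illustrative values
print(json.dumps(case_dict, default=to_builtin))
```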
In `@src/common/m_ib_patches.fpp`:
- Around line 87-89: The initialization currently calls levelset routines
unconditionally for several geometry types and must be protected by the same
store_levelset guard used for spheres: wrap each call that computes levelset and
levelset_norm (e.g., calls like s_cuboid_levelset, s_cylinder_levelset,
s_airfoil3d_levelset and the 2D variants such as s_circle_levelset,
s_rectangle_levelset, s_airfoil2d_levelset, s_ellipse_levelset) in if
(store_levelset) then ... end if, so levelset/levelset_norm are only referenced
when they have been allocated; apply this change to every geometry branch in
m_ib_patches.fpp where levelset routines are invoked.
In `@src/post_process/m_data_input.f90`:
- Around line 535-537: Several statements in src/post_process/m_data_input.f90
(notably the conditional invoking s_read_parallel_filtered_data with
q_filtered_wrt, t_step and t_step_stop) exceed 100 characters; wrap those long
Fortran source lines using the continuation character (&) so they satisfy
Fortitude S001. Locate the long conditional and the long call to
s_read_parallel_filtered_data (and the other long lines around the ranges
mentioned: the block from ~671–703 and the line near ~762) and split argument
lists and logical expressions onto subsequent lines with trailing & on the
broken line and leading &/aligned indentation on the continuation lines,
preserving the original order of tokens and parentheses. Ensure no token is
split across lines and keep code readable by grouping related arguments (e.g.,
t_step checks on one line, the call arguments on subsequent lines) while keeping
each resulting line under 100 chars.
In `@src/pre_process/m_initial_condition.fpp`:
- Around line 102-107: The code allocates levelset and levelset_norm only when
store_levelset is true but later calls s_apply_ib_patches and other geometry
routines that unconditionally pass levelset/levelset_norm, causing potential
dereference of unallocated arrays; update callers (e.g., the s_apply_ib_patches
invocation that currently passes ib_markers%sf, levelset, levelset_norm) to
conditionally pass those optional arguments or use the alternate call when
store_levelset is false, and also wrap any computations that write/read
levelset/levelset_norm in an if (store_levelset) guard so non-spherical IBs
never touch unallocated levelset or levelset_norm storage.
In `@src/simulation/m_additional_forcing.fpp`:
- Around line 111-115: The kinetic energy term currently hardcodes three
momentum components (momxb, momxb+1, momxb+2); change it to compute ke by
looping over offsets 0..(num_dims-1) and summing
q_cons_vf(momxb+offset)%sf(i,j,k)**2, then divide by rho and use 0.5*ke in the
spatial_eps update (keep E_idx, dVol and rho usage). Declare and initialize a
local ke variable inside the GPU loop, add it to the loop's private clause, and
replace the explicit three-component expression with the loop-based ke
calculation so the code works for 1D/2D/3D.
- Around line 48-52: The calculation for domain_vol unconditionally uses
domain_glb(3,*) which causes out-of-bounds in 2D; update the block around
domain_vol to compute volume based on num_dims (use 3D product when num_dims==3,
2D product when num_dims==2, and 1D when num_dims==1) before computing
avg_coeff, ensuring domain_glb indices are only accessed for existing dimensions
and preserving the subsequent avg_coeff = 1._wp/(domain_vol*(1._wp -
volfrac_phi)).
♻️ Duplicate comments (15)
src/pre_process/m_data_output.fpp (1)
842-865: Fix MPI datatype/count for levelset and levelset_norm writes.
Using `mpi_p` and raw element counts bypasses mixed-precision sizing; use `mpi_io_p` and scale by `mpi_io_type`.
🐛 Proposed fix
```diff
-call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_levelset_DATA%view, &
+call MPI_FILE_SET_VIEW(ifile, disp, mpi_io_p, MPI_IO_levelset_DATA%view, &
                        'native', mpi_info_int, ierr)
-call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelset_DATA%var%sf, data_size*num_ibs, &
-                        mpi_p, status, ierr)
+call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelset_DATA%var%sf, data_size*num_ibs*mpi_io_type, &
+                        mpi_io_p, status, ierr)
 …
-call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_levelsetnorm_DATA%view, &
+call MPI_FILE_SET_VIEW(ifile, disp, mpi_io_p, MPI_IO_levelsetnorm_DATA%view, &
                        'native', mpi_info_int, ierr)
-call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelsetnorm_DATA%var%sf, data_size*num_ibs*3, &
-                        mpi_p, status, ierr)
+call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelsetnorm_DATA%var%sf, data_size*num_ibs*3*mpi_io_type, &
+                        mpi_io_p, status, ierr)
```
src/post_process/m_global_parameters.fpp (1)
1066-1070: Deallocate q_filtered_wrt MPI_IO buffers before nulling.
The extra stats slots are allocated but only nullified here, which leaks memory. Based on learnings, please use the device-aware deallocation macro.
🐛 Proposed fix
```diff
 if (q_filtered_wrt) then
     do i = sys_size + 1, sys_size + volume_filter_dt%stat_size
-        MPI_IO_DATA%var(i)%sf => null()
+        if (associated(MPI_IO_DATA%var(i)%sf)) then
+            @:DEALLOCATE(MPI_IO_DATA%var(i)%sf)
+        end if
+        MPI_IO_DATA%var(i)%sf => null()
     end do
 end if
```
toolchain/mfc/case_validator.py (2)
1111-1124: Remove trailing whitespace in the new validators (lint failure).
Pylint is failing on trailing whitespace in these added blocks; trimming resolves the pipeline error.
Also applies to: 1162-1200
1194-1195: Fix the even-dimension parity check.
The current `(dim + 1) % 2` test rejects valid even sizes; check the dimension directly.
🐛 Proposed fix
```diff
-self.prohibit((m_glb + 1) % 2 != 0 or (n_glb + 1) % 2 != 0 or (p_glb + 1) % 2 != 0,
+self.prohibit(m_glb % 2 != 0 or n_glb % 2 != 0 or p_glb % 2 != 0,
              "volume_filter_momentum_eqn requires global dimensions divisible by 2 in every direction")
```
386-391: Avoid overwriting base MPI views when initializing filtered views.
The loop starts at `sys_size`, which can reinitialize the last base variable’s view. Start at `sys_size + 1` instead.
🐛 Proposed fix
```diff
-do i = sys_size, alt_sys
+do i = sys_size + 1, alt_sys
     call MPI_TYPE_CREATE_SUBARRAY(num_dims, sizes_glb, sizes_loc, start_idx, &
                                   MPI_ORDER_FORTRAN, mpi_p, MPI_IO_DATA%view(i), ierr)
     call MPI_TYPE_COMMIT(MPI_IO_DATA%view(i), ierr)
 end do
```
src/common/m_ib_patches.fpp (1)
619-665: Avoid shared-state coordinate conversion inside the GPU loop.
`s_convert_cylindrical_to_cartesian_coord` writes module-level `cart_y`/`cart_z`, which is unsafe in parallel loops. Use the pure function with thread-local temporaries instead.
🐛 Suggested fix (thread-local conversion)
```diff
-if (grid_geometry == 3) then
-    call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
-else
-    cart_y = y_cc(j)
-    cart_z = z_cc(k)
-end if
+if (grid_geometry == 3) then
+    tmp_cyl = (/0._wp, y_cc(j), z_cc(k)/)
+    tmp_cart = f_convert_cyl_to_cart(tmp_cyl)
+    cart_y = tmp_cart(2)
+    cart_z = tmp_cart(3)
+else
+    cart_y = y_cc(j)
+    cart_z = z_cc(k)
+end if
```
src/simulation/m_data_output.fpp (2)
85-123: Serial I/O path does not forward filtered data arguments.
When `parallel_io` is false (Line 117-118), the call to `s_write_serial_data_files` does not pass the new filtered arguments (`filtered_fluid_indicator_function`, `stat_q_cons_filtered`, etc.). If `q_filtered_wrt` is enabled with serial I/O, the filtered data will not be written.
Either extend `s_write_serial_data_files` to accept and handle these arguments, or add a runtime check/abort when `q_filtered_wrt` is true and `parallel_io` is false.
801-814: Argument count exceeds the ≤6 guideline.
`s_write_parallel_data_files` now accepts 10 arguments. As per coding guidelines, consider introducing a derived-type params struct to bundle the filtered/stat fields and reduce the signature complexity.
src/simulation/m_ibm.fpp (2)
625-634: Off-by-one risk: loop accesses `s_cc(index + 1)` but bound allows `index == bound`.
The bound calculation at lines 627, 630, 633 sets `bound = m/n/p + buff_size`, but the while-loop at line 648-656 accesses `s_cc(index + 1)`. When `index == bound`, `s_cc(index + 1)` reads past the allocated array bounds.
🐛 Suggested fix
```diff
 if (dim == 1) then
     s_cc => x_cc
-    bound = m + buff_size
+    bound = m + buff_size - 1
 elseif (dim == 2) then
     s_cc => y_cc
-    bound = n + buff_size
+    bound = n + buff_size - 1
 else
     s_cc => z_cc
-    bound = p + buff_size
+    bound = p + buff_size - 1
 end if
```
651-655: Replace `error stop` with `s_mpi_abort` for MPI safety.
Using `error stop` can leave other MPI ranks hanging. As per coding guidelines, use `s_mpi_abort` for error termination.
🐛 Suggested fix
```diff
 if (index < -buff_size .or. index > bound) then
-    print *, "proc_rank=", proc_rank, "temp_loc=", temp_loc, " index=", index, "ib=", patch_id, "dim", dim, "dir", dir, "i, j, k", i, j, k
-    print *, "Increase buff_size further in m_helper_basic (currently set to a minimum of 10)"
-    error stop "Increase buff_size"
+    call s_mpi_abort("Image point search exceeded buffer bounds. Increase buff_size (minimum 10).")
 end if
```
170-172: Division by zero risk when local density is zero.
Line 172 divides by `rho`, which is computed by summing fluid densities. At phase interfaces or in regions where all fluids are absent, `rho` could be zero or near-zero, causing division by zero or numerical instability.
🐛 Suggested fix
```diff
 ! f_E
-q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
-                                  + q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/rho
+if (rho > 1.e-12_wp) then
+    q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
+                                      + q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/rho
+else
+    q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt
+end if
```
214-216: Misleading error message: references 'p' but checks 'Ny'.
The check validates `mod(Ny, num_procs)` but the error message incorrectly states the requirement is for `p`.
✏️ Suggested fix
```diff
 if (mod(Ny, num_procs) /= 0) then
-    call s_mpi_abort('Volume filtering requires p to be divisible by the number of ranks')
+    call s_mpi_abort('Volume filtering requires Ny to be divisible by the number of ranks')
 end if
```
585-598: Critical: Out-of-bounds array access in finite-difference stencil.
The loop iterates `i = 0, m`, `j = 0, n`, `k = 0, p`, but the stencil accesses `i + l` where `l ∈ [-4, 4]`. At boundary cells (e.g., `i = 0`), this accesses `fluid_indicator_function%sf(-4, j, k)`, causing out-of-bounds memory access unless `idwbuff` provides sufficient ghost cells.
Verify that `idwbuff` bounds provide at least 4 ghost cells in each direction, or restrict the loop ranges:
🐛 Option 1: Restrict loop ranges (safer)
```diff
 $:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
-do i = 0, m
-    do j = 0, n
-        do k = 0, p
+do i = 4, m - 4
+    do j = 4, n - 4
+        do k = 4, p - 4
             $:GPU_LOOP(parallelism='[seq]')
             do l = -4, 4
```
🔍 Option 2: Verify buffer allocation
```bash
#!/bin/bash
# Check idwbuff allocation bounds to verify sufficient ghost cells
rg -n 'idwbuff' --type-add 'fpp:*.fpp' --type=fpp -C3 | head -60
```
785-787: Potential division by zero on filtered density.
When `q_cons_filtered(1)%sf(i, j, k)` (filtered density) approaches zero inside solid regions, this division produces NaN/Inf values.
🐛 Proposed fix with epsilon floor
```diff
+real(wp), parameter :: eps_rho = 1.0e-12_wp
 reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
-    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
+    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) &
+       / max(q_cons_filtered(1)%sf(i, j, k), eps_rho))
```
1552-1560: Multiple divisions by `filtered_fluid_indicator_function` risk NaN in solid regions.
Lines 1556, 1605, 1655, and 1707 all divide by `filtered_fluid_indicator_function%sf(i, j, k)`, which is zero inside particles. This systematically produces NaN/Inf values throughout the filtered fields.
Apply a consistent epsilon guard across all four locations:
🐛 Proposed pattern for all divisions
Define a module-level constant and apply uniformly:
```fortran
real(wp), parameter :: eps_indicator = 1.0e-12_wp
```
Then replace each division:
```diff
-q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/filtered_fluid_indicator_function%sf(i, j, k)
+q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm &
+                                 / max(filtered_fluid_indicator_function%sf(i, j, k), eps_indicator)
```
Apply the same pattern at lines 1605, 1655, and 1707.
🧹 Nitpick comments (8)
src/pre_process/m_global_parameters.fpp (1)
949-954: Use @:ALLOCATE for the new levelset MPI buffers.
These allocations are in a `.fpp` module and should use the device-aware macro. As per coding guidelines, please prefer `@:ALLOCATE` here.
♻️ Suggested change
```diff
-allocate (MPI_IO_levelset_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs))
-allocate (MPI_IO_levelsetnorm_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs, 1:3))
+@:ALLOCATE(MPI_IO_levelset_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs), &
+           MPI_IO_levelsetnorm_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs, 1:3))
```
src/post_process/m_global_parameters.fpp (1)
872-878: Use @:ALLOCATE for q_filtered_wrt MPI_IO_DATA buffers.
These new allocations are in a `.fpp` module; using the device-aware macro keeps GPU builds consistent. As per coding guidelines, please prefer `@:ALLOCATE` here.
♻️ Suggested change
```diff
-allocate (MPI_IO_DATA%var(i)%sf(0:m, 0:n, 0:p))
+@:ALLOCATE(MPI_IO_DATA%var(i)%sf(0:m, 0:n, 0:p))
```
src/post_process/m_data_input.f90 (1)
671-680: Consider bundling MPI I/O params into a derived type.
This signature has 8 parameters; grouping the MPI offsets into a small derived type would align with the ≤6-argument guideline and simplify call sites. As per coding guidelines, ...
src/simulation/m_ibm.fpp (2)
713-723: Remove or enable commented-out code.
Lines 714-723 contain commented-out alternative logic for 1D subsection checks. If this code is no longer needed, remove it to reduce clutter. If it's intended for future use, consider adding a TODO comment explaining its purpose.
805-811: Remove or enable commented-out code.
Lines 806-811 contain the same commented-out 1D subsection logic pattern. This duplicates the commented code in `s_find_num_ghost_points`. Either remove both or document the intent.
src/simulation/m_volume_filtering.fpp (3)
339-343: Consider using `newunit=` for automatic unit assignment.
Hardcoded unit number `100` may conflict with other file operations in the codebase. Using `newunit=` provides safer automatic assignment.
♻️ Suggested improvement
```diff
+integer, save :: particle_force_unit ! file for particle forces
 if (compute_particle_drag) then
     if (proc_rank == 0) then
-        open (unit=100, file='particle_force.bin', status='replace', form='unformatted', access='stream', action='write')
+        open (newunit=particle_force_unit, file='particle_force.bin', status='replace', form='unformatted', access='stream', action='write')
     end if
 end if
```
Then update the `write` and `close` statements to use `particle_force_unit` instead of `100`.
1037-1765: Subroutine `s_filter_batch` exceeds size guidelines (~730 lines).
Per coding guidelines, subroutines should be ≤ 500 lines. This subroutine contains highly repetitive FFT transform patterns for different field types (conservative vars, pressure, Reynolds stress, effective viscosity, interphase momentum exchange).
Consider extracting reusable helper routines:
- `s_fft_forward_xy`: Handles X FFT → reorder → Y FFT → store to batch slab
- `s_fft_z_convolve`: Handles Z FFT → convolution → inverse Z FFT
- `s_fft_backward_yx`: Handles Y inverse FFT → reorder → X inverse FFT → store result
This would reduce `s_filter_batch` to ~100-150 lines of orchestration logic and improve maintainability.
3-2015: File exceeds 1000-line guideline (~2015 lines).
The coding guidelines recommend files ≤ 1000 lines. Consider splitting into focused submodules:
- `m_volume_filtering`: Public API, orchestration, core filtering workflow
- `m_volume_filtering_fft`: FFT plans, forward/backward transforms, slab transposes
- `m_volume_filtering_terms`: Stress tensor, Reynolds stress, effective viscosity computations
Codecov Report
❌ Patch coverage is
Additional details and impacted files
```
@@            Coverage Diff             @@
##           master    #1043      +/-   ##
==========================================
- Coverage   44.16%   43.00%   -1.17%
==========================================
  Files          71       74       +3
  Lines       20417    21923    +1506
  Branches     1991     2067      +76
==========================================
+ Hits         9018     9428     +410
- Misses      10252    11308    +1056
- Partials     1147     1187      +40
```
☔ View full report in Codecov by Sentry.
```fortran
do j = 0, n
    do i = 0, m
        if (grid_geometry == 3) then
            call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
```
Suggestion: The call to s_convert_cylindrical_to_cartesian_coord writes module-level variables cart_y/cart_z; inside the GPU-parallel loop those names are declared private — the subroutine still updates the shared module variables, causing data races and incorrect values. Compute cartesian coordinates into the local/private variables directly inside the loop instead of calling the subroutine. [race condition]
Severity Level: Critical 🚨
- ❌ Sphere IB placement incorrect for periodic cylindrical grids.
- ⚠️ s_apply_ib_patches -> s_ib_sphere affected.
- ⚠️ Non-deterministic ib_markers_sf assignments observed.
```diff
-call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
+cart_y = y_cc(j)*sin(z_cc(k))
+cart_z = y_cc(j)*cos(z_cc(k))
```
Steps of Reproduction ✅
1. Ensure there is a spherical IB patch: call path s_apply_ib_patches -> s_ib_sphere. In
src/common/m_ib_patches.fpp the call to s_ib_sphere is invoked from s_apply_ib_patches
(see the loop that calls `call s_ib_sphere(i, ib_markers_sf)` around lines added earlier
in the PR, e.g. ~line 86 in the new-hunk).
2. Enable periodic immersed boundaries and use cylindrical grid geometry: set
`periodic_ibs = .true.` and `grid_geometry == 3`. This causes s_ib_sphere to take the
periodic branch which contains the GPU loop starting at the $:GPU_PARALLEL_LOOP line shown
at src/common/m_ib_patches.fpp:622.
3. Enter the GPU-parallel loop (src/common/m_ib_patches.fpp:623-631). The loop privatizes
`cart_y`/`cart_z` locally, but calls `s_convert_cylindrical_to_cartesian_coord(y_cc(j),
z_cc(k))` which writes the module-level variables `cart_y` and `cart_z` (the subroutine is
defined in the same file: subroutine s_convert_cylindrical_to_cartesian_coord in
src/common/m_ib_patches.fpp).
4. Observe non-deterministic or incorrect ib_markers_sf writes (wrong cells marked)
because multiple threads execute the subroutine which updates the shared module variables,
producing a data race. Reproduced by running a parallel GPU/accelerator build with a
spherical IB in a cylindrical, periodic domain.

call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
                                 filtered_fluid_indicator_function, &
                                 stat_q_cons_filtered, stat_filtered_pressure, &
                                 stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
Suggestion: The call to s_write_parallel_data_files unconditionally passes several optional dummy arguments (e.g. filtered_fluid_indicator_function, stat_q_cons_filtered, stat_filtered_pressure, stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch) from s_write_data_files even when they might not have been provided to s_write_data_files. In Fortran you must not pass an optional actual argument that is not present; this can lead to undefined behaviour or runtime/compile problems. Fix by calling the parallel writer with the additional optional arguments only when at least one of them is present; otherwise call the routine with only the mandatory arguments. [possible bug]
Severity Level: Critical 🚨
- ❌ Parallel restart output may crash at s_write_parallel_data_files.
- ❌ MPI-file writes risk memory corruption.
- ⚠️ Filtered-statistics output unreliable or omitted.
- ⚠️ Affects any run using parallel_io and missing optionals.

-call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
-                                 filtered_fluid_indicator_function, &
-                                 stat_q_cons_filtered, stat_filtered_pressure, &
-                                 stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+if (present(filtered_fluid_indicator_function) .or. present(stat_q_cons_filtered) .or. &
+    present(stat_filtered_pressure) .or. present(stat_reynolds_stress) .or. &
+    present(stat_eff_visc) .or. present(stat_int_mom_exch)) then
+    call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
+                                     filtered_fluid_indicator_function, &
+                                     stat_q_cons_filtered, stat_filtered_pressure, &
+                                     stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+else
+    call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta)
+end if
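As a generic aside, not MFC code and with illustrative names only, the present() guard that the improved code relies on behaves as in this minimal sketch:

program optional_demo
    implicit none

    call report()            ! optional argument omitted
    call report(beta=2.5)    ! optional argument supplied

contains

    ! report prints a different message depending on whether beta was passed
    subroutine report(beta)
        real, intent(in), optional :: beta
        if (present(beta)) then
            print *, 'beta supplied:', beta
        else
            print *, 'beta absent; using the mandatory-arguments-only path'
        end if
    end subroutine report

end program optional_demo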
Steps of Reproduction ✅
1. Enter subroutine s_write_data_files in module m_data_output
(src/simulation/m_data_output.fpp,
subroutine header and body span the PR hunk lines ~82-124). The conditional at line 117
chooses the parallel branch when parallel_io is true.
2. With parallel_io true the code executes the else-branch and reaches the call at lines
120-123 which unconditionally passes filtered_fluid_indicator_function and the other
filtered-statistics variables to s_write_parallel_data_files.
3. If the caller invoked s_write_data_files without supplying those optional actual
arguments
(they are absent in s_write_data_files), then the variables are absent optionals inside
s_write_data_files; passing an absent optional actual to another procedure is illegal
in Fortran. The callee s_write_parallel_data_files is defined in this file (see its
header around lines ~798-820).
4. Expected outcome: depending on compiler/ABI this can produce a compile-time diagnostic
or undefined runtime behaviour (crash, memory corruption) when executing the call at
src/simulation/m_data_output.fpp:120-123. The correct fix is to only forward optionals
when present (or provide a two-call conditional as in the improved_code).

if (present(beta)) then
    alt_sys = sys_size + 1
else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
    alt_sys = sys_size + volume_filter_dt%stat_size
else
    alt_sys = sys_size
Suggestion: The computation of alt_sys uses an if/else-if chain so only one extra contribution is added: either +1 for beta or +volume_filter_dt%stat_size for filtered statistics, but not both. If both beta and filtered statistics should be written in the same run step this loses one of them. Compute alt_sys by starting from sys_size and adding each contribution independently (check present(beta) and the filtered condition separately). [logic error]
Severity Level: Critical 🚨
- ❌ MPI restart variable count (NVARS_MOK) may be wrong.
- ❌ Restart/restart_data file layout corrupted.
- ⚠️ Filtered statistics output omitted unexpectedly.

-if (present(beta)) then
-    alt_sys = sys_size + 1
-else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
-    alt_sys = sys_size + volume_filter_dt%stat_size
-else
-    alt_sys = sys_size
+alt_sys = sys_size
+if (present(beta)) then
+    alt_sys = alt_sys + 1
+end if
+if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+    alt_sys = alt_sys + volume_filter_dt%stat_size
Steps of Reproduction ✅
1. In s_write_parallel_data_files (src/simulation/m_data_output.fpp) the altered system
size
alt_sys is computed at lines 848-854 using an if / else if chain.
2. Scenario: both a beta correction is present (present(beta) true) and
filtered-statistics are
scheduled to be written (q_filtered_wrt true and t_step matches 0 or t_step_stop).
3. Current code picks the first branch (present(beta)) and sets alt_sys = sys_size + 1,
skipping the addition of volume_filter_dt%stat_size. Later the code uses alt_sys when
deciding how many variables to write (for example the do i = 1, alt_sys loop added at
src/simulation/m_data_output.fpp line 1019 of the PR hunk), and when setting NVARS_MOK
for MPI offsets (NVARS_MOK = int(alt_sys, MPI_OFFSET_KIND) in the MFC_MPI path).
4. Observe: the file layout/loop count will omit the filtered-statistics block (or omit
the
beta block depending on ordering), producing truncated or misaligned MPI writes and
corrupted or incomplete restart/statistics files. The improved_code sums both
contributions.

CodeAnt AI Incremental review completed.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/simulation/m_ibm.fpp (1)
76-83: Inconsistent allocation logic between 2D and 3D cases. In the 3D case (lines 70-75), levelset%sf and levelset_norm%sf are only allocated when store_levelset is true. However, in the 2D case (lines 79-82), these arrays are allocated unconditionally. Since deallocation at lines 1341-1344 is guarded by store_levelset, this creates a memory leak in 2D when store_levelset = F.
🐛 Suggested fix
 else
     @:ALLOCATE(ib_markers%sf(-buff_size:m+buff_size, &
         -buff_size:n+buff_size, 0:0))
-    @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
-        -buff_size:n+buff_size, 0:0, 1:num_ibs))
-    @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
-        -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+    if (store_levelset) then
+        @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
+            -buff_size:n+buff_size, 0:0, 1:num_ibs))
+        @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
+            -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+    end if
 end if
🤖 Fix all issues with AI agents
In `@src/common/m_ib_patches.fpp`:
- Around line 621-644: The GPU parallel loop that marks ib_markers_sf inside the
periodic_ibs block uses $:GPU_PARALLEL_LOOP with private='[cart_y, cart_z, ix,
iy, iz]' but omits the loop indices i, j, k; update the private list for the
GPU_PARALLEL_LOOP surrounding the nested do k=0,p / do j=0,n / do i=0,m block
(the periodic_ibs section that sets ib_markers_sf based on patch_id, center, r2
and calls s_convert_cylindrical_to_cartesian_coord) to also include i, j, k so
those loop counters are private on the device and avoid potential data races.
In `@src/simulation/m_data_output.fpp`:
- Around line 964-968: The call to s_initialize_mpi_data_filtered dereferences
optional arguments when q_filtered_wrt is true (at the t_step check) without
verifying their presence; update the conditional to require present(...) for
each optional actual (filtered_fluid_indicator_function, stat_q_cons_filtered,
stat_filtered_pressure, stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
before calling s_initialize_mpi_data_filtered, and if any are absent emit a
clear error (abort) explaining the missing argument(s) so the routine is never
called with absent optionals.
🧹 Nitpick comments (2)
src/simulation/m_ibm.fpp (1)
682-726: Remove unused variables and commented-out code. Variables subsection_x, subsection_y, subsection_z (lines 682-684) are declared but only used in commented-out code (lines 716-725). This creates dead code and confusion. Either remove the declarations and comments, or document why this alternative implementation is preserved for future use.
♻️ If removing unused code
 integer, dimension(2*gp_layers + 1, 2*gp_layers + 1, 2*gp_layers + 1) &
     :: subsection_3D
-integer, dimension(2*gp_layers + 1) :: subsection_x
-integer, dimension(2*gp_layers + 1) :: subsection_y
-integer, dimension(2*gp_layers + 1) :: subsection_z
 integer :: i, j, k !< Iterator variables

And remove the commented blocks at lines 716-725.
src/common/m_mpi_common.fpp (1)
1606-1616: Consider adding minimum cell count validation for slab decomposition. The slab decomposition branch only checks divisibility (mod((p + 1), num_procs_z) == 0) but doesn't validate that each processor receives at least num_stcls_min*recon_order cells in the z-direction, unlike the other decomposition branches. This could allow invalid decompositions with too few cells for the stencil.
♻️ Suggested validation
 else if (slab_domain_decomposition) then
     num_procs_x = 1
     num_procs_y = 1
     num_procs_z = num_procs
     ierr = -1
-    if (mod((p + 1), num_procs_z) == 0) then
+    if (mod((p + 1), num_procs_z) == 0 .and. &
+        (p + 1)/num_procs_z >= num_stcls_min*recon_order) then
         ierr = 0
     end if
if (periodic_ibs) then
    $:GPU_PARALLEL_LOOP(collapse=3, private='[cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
    do k = 0, p
        do j = 0, n
            do i = 0, m
                if (grid_geometry == 3) then
                    call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
                else
                    cart_y = y_cc(j)
                    cart_z = z_cc(k)
                end if
                do ix = 1, 2
                    do iy = 1, 2
                        do iz = 1, 2
                            if (((x_cc(i) - center(1, ix))**2 + (cart_y - center(2, iy))**2 + (cart_z - center(3, iz))**2) <= r2) then
                                ib_markers_sf(i, j, k) = patch_id
                            end if
                        end do
                    end do
                end do
            end do
        end do
    end do
    $:END_GPU_PARALLEL_LOOP()
Missing loop indices in GPU parallel loop private list.
The private list on line 622 includes [cart_y, cart_z, ix, iy, iz] but omits the outer loop indices i, j, k. Other GPU loops in this file (e.g., lines 170, 646, 709) consistently include loop indices in their private lists. Missing these could cause data races or incorrect behavior on GPU.
🐛 Proposed fix
-$:GPU_PARALLEL_LOOP(collapse=3, private='[cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i, j, k, cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
    call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
                                        stat_q_cons_filtered, stat_filtered_pressure, &
                                        stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
end if
Guard optional filtered args before initializing MPI data.
Line 964 dereferences optional arguments without present() checks when q_filtered_wrt is true. If any are omitted, behavior is undefined.
✅ Proposed fix (add presence guard + abort)
- if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
- call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
- stat_q_cons_filtered, stat_filtered_pressure, &
- stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
- end if
+ if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+ if (.not. (present(filtered_fluid_indicator_function) .and. &
+ present(stat_q_cons_filtered) .and. present(stat_filtered_pressure) .and. &
+ present(stat_reynolds_stress) .and. present(stat_eff_visc) .and. &
+ present(stat_int_mom_exch))) then
+ call s_mpi_abort('q_filtered_wrt requires all filtered-stat arguments in s_write_parallel_data_files.')
+ end if
+ call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
+ stat_q_cons_filtered, stat_filtered_pressure, &
+ stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+ end if
User description
Description
- Overarching goal of investigating unclosed terms present in the volume filtered Euler-Lagrange framework
- Implements explicit computation of the unclosed terms present in the volume filtered momentum equation in the Euler-Lagrange framework for particle-laden flows
- Explicit immersed boundary drag force computation
- General field statistics computation (1st-4th order statistical moments)
- Periodic spherical immersed boundaries
- Bulk flow periodic forcing
Type of change
Scope
How Has This Been Tested?
Test Configuration:
- macOS
- Linux (Ubuntu)
- NCSA Delta NVIDIA A100 GPUs
Checklist
- I have made corresponding changes to the documentation (docs/)
- I have added example cases in examples/ that demonstrate my new feature performing as expected. They run to completion and demonstrate "interesting physics"
- I ran ./mfc.sh format before committing my code

If your code changes any code source files (anything in src/simulation):

To make sure the code is performing as expected on GPU devices, I have:
- Enclosed the new feature via nvtx ranges so that they can be identified in profiles
- Ran a profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
- Ran a profile using ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR

PR Type
Enhancement, Tests
Description
Implements comprehensive volume filtering framework for particle-laden flows with explicit computation of unclosed terms in the Euler-Lagrange framework
Adds FFT-based filtering infrastructure supporting both CPU (FFTW) and GPU (cuFFT) backends with MPI parallelization
Computes key quantities including Reynolds stress tensors, effective viscosity, and interphase momentum exchange terms
Implements periodic spherical immersed boundaries with on-the-fly levelset computation when storage is disabled
Adds periodic bulk flow forcing based on space-time averaged quantities with fluid indicator function weighting
Extends MPI communication infrastructure with scalar field send/receive buffers and global domain bounds computation
Implements statistical moment computation (1st-4th order) using Welford's online algorithm with GPU parallelization (a minimal sketch follows this list)
Adds support for writing volume-filtered conservative variables, pressure, Reynolds stress, effective viscosity, and interphase momentum exchange
Includes 5 new test cases demonstrating volume filtering, periodic forcing, and particle drag computation
Increases maximum immersed boundary patches from 10 to 1000 to support larger particle-laden systems
Adds comprehensive case validation for volume filtering and periodic forcing requirements
Includes utility scripts for generating 3D Voronoi-based sphere distributions for uniform particle packing
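To make the Welford bullet above concrete, the following minimal serial sketch shows the online update for a running mean and variance; it is not the PR's GPU-parallel routine, and all names and data are illustrative. Third- and fourth-order moments follow the same pattern with extra running accumulators.

program welford_demo
    implicit none
    integer :: n, i
    real(8) :: samples(5), mean, m2, delta, delta2

    samples = [1.0d0, 2.0d0, 4.0d0, 8.0d0, 16.0d0]   ! stand-in data
    n = 0; mean = 0.0d0; m2 = 0.0d0
    do i = 1, size(samples)
        n = n + 1
        delta = samples(i) - mean          ! deviation from the old mean
        mean = mean + delta/real(n, 8)     ! update the running mean
        delta2 = samples(i) - mean         ! deviation from the new mean
        m2 = m2 + delta*delta2             ! accumulate squared deviations
    end do
    print *, 'mean     =', mean
    print *, 'variance =', m2/real(n - 1, 8)   ! unbiased sample variance
end program welford_demo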
Diagram Walkthrough
File Walkthrough
26 files
m_volume_filtering.fpp
Volume filtering module for Euler-Lagrange particle-laden flows
src/simulation/m_volume_filtering.fpp
flows with explicit computation of unclosed terms in the
Euler-Lagrange framework
and GPU (cuFFT) backends with MPI parallelization
viscosity, and interphase momentum exchange terms
management for immersed boundary methods
m_data_output.fpp
Add volume-filtered data output capabilities to I/O module
src/simulation/m_data_output.fpp
s_write_data_files and s_write_parallel_data_files subroutines with optional parameters for filtered quantities
pressure, Reynolds stress, effective viscosity, and interphase
momentum exchange
and final time steps when
q_filtered_wrt flag is enabled
s_initialize_mpi_data_filtered for MPI-based initialization of filtered field data structures
m_time_steppers.fpp
Add volume filtering module dependencies to time steppers
src/simulation/m_time_steppers.fpp
m_volume_filtering and m_compute_statistics capabilities into time-stepping routines
m_mpi_common.fpp
MPI scalar field communication and global domain bounds
src/common/m_mpi_common.fpp
memory management
domain_glb array to store global domain bounds across all processors
s_mpi_sendrecv_variables_buffers_scalarfield subroutine for scalar field MPI communication with pack/unpack operations
s_mpi_global_domain_bounds subroutine to compute and broadcast global domain boundaries
distribution
m_start_up.fpp
Volume filtering and statistics computation integration
src/simulation/m_start_up.fpp
additional forcing
store_levelset flag is enabled
statistics calculation in main time-stepping loop
filtered data
statistics modules
m_compute_statistics.fpp
Statistical moments computation for volume filtered variables
src/simulation/m_compute_statistics.fpp
for unclosed terms
interphase momentum exchange, and filtered conservative variables
moments with GPU parallelization
module
m_ibm.fpp
Periodic spherical immersed boundaries and on-the-fly levelset
src/simulation/m_ibm.fpp
all three directions
spherical boundaries when
store_levelset is disabled
boundaries
of full 3D subsection
m_data_input.f90
Post-processing support for filtered variables and statistics
src/post_process/m_data_input.f90
statistical moment arrays
s_allocate_filtered_arrays subroutine to allocate filtered variable arrays
s_read_parallel_filtered_data subroutine to read volume filtered data from restart files
routine
m_additional_forcing.fpp
Periodic bulk flow forcing for particle-laden flows
src/simulation/m_additional_forcing.fpp
averaged quantities
indicator function weighting
m_boundary_common.fpp
Scalar field boundary condition population
src/common/m_boundary_common.fpp
s_populate_scalarfield_buffers subroutine for populating scalar field boundary buffers
s_periodic_scalarfield pure subroutine for periodic boundary conditions on scalar fields
scalar fields
m_ib_patches.fpp
Periodic sphere initialization with domain wrapping
src/common/m_ib_patches.fpp
of sphere centroids
mark grid cells accordingly
projections
m_compute_levelset.fpp
Periodic levelset computation for spherical boundaries
src/common/m_compute_levelset.fpp
for spheres
domain boundaries
nearest surface
computation
m_mpi_proxy.fpp
MPI parameter broadcasting for new features
src/simulation/m_mpi_proxy.fpp
t_step_start_stats, mom_f_idx, forcing_window
periodic_ibs, compute_particle_drag, periodic_forcing, forcing_wrt, volume_filtering_momentum_eqn, store_levelset, slab_domain_decomposition, q_filtered_wrt
u_inf_ref, rho_inf_ref, P_inf_ref, filter_width, forcing_dt, fluid_volume_fraction
m_start_up.fpp
Pre-processing domain bounds initialization
src/pre_process/m_start_up.fpp
s_mpi_global_domain_bounds to compute global domain bounds during preprocessing
and levelset storage flags
m_global_parameters.fpp
Global parameters for volume filtering and periodic forcing
src/simulation/m_global_parameters.fpp
and immersed boundary features
s_assign_default_values_to_user_inputs for volume filtering
m_start_up.fpp
Post-processing output for filtered quantities and statistics
src/post_process/m_start_up.fpp
q_filtered_wrt is enabled (Reynolds stress, effective viscosity, interfacial momentum exchange,
filtered conserved variables, filtered pressure)
m_data_output.fpp
Conditional levelset storage in restart files
src/pre_process/m_data_output.fpp
store_levelset flag
writing
store_levelset is false
m_global_parameters.fpp
Post-process global parameters for filtering and IB features
src/post_process/m_global_parameters.fpp
levelset storage, domain decomposition, and filtered output
q_filtered_wrt is enabled
m_mpi_proxy.fpp
MPI broadcast for new immersed boundary parameters
src/pre_process/m_mpi_proxy.fpp
periodic_ibs, store_levelset, slab_domain_decomposition
m_mpi_proxy.fpp
MPI broadcast for post-process filtering parameters
src/post_process/m_mpi_proxy.fpp
periodic_ibs, store_levelset, slab_domain_decomposition, q_filtered_wrt
m_rhs.fpp
Periodic forcing integration into RHS computation
src/simulation/m_rhs.fpp
m_additional_forcing
profiling ranges
s_compute_periodic_forcing when periodic_forcing is enabled
m_global_parameters.fpp
Pre-process global parameters for periodic IB support
src/pre_process/m_global_parameters.fpp
domain decomposition
s_assign_default_values_to_user_inputs
p_main.fpp
Volume filtering initialization in main simulation
src/simulation/p_main.fpp
m_volume_filtering
forcing
when volume filtering is enabled
m_constants.fpp
Increased maximum patches limit for IB support
src/common/m_constants.fpp
num_patches_max from 10 to 1000 to support larger numbers of immersed boundaries
case_validator.py
Case validation for volume filtering and periodic forcing
toolchain/mfc/case_validator.py
num_ibs validation limit from 10 to 1000
periodic BCs
check_periodic_forcing method validating required parameters for periodic forcing
check_volume_filtering method validating filter width, domain decomposition, and grid requirements
validate_common method
case_dicts.py
Case dictionary parameter definitions for new features
toolchain/mfc/run/case_dicts.py
support
SIMULATION
and periodic forcing
q_filtered_wrt to POST_PROCESS dictionary
1 files
delta.mako
GPU memory allocation increase for Delta system
toolchain/templates/delta.mako
NCSA Delta
6 files
case.py
Test case for volume filtered particle-laden flow
runs/phi01/case.py
computation
case.py
Single sphere volume filtering test case
runs/3d_1sphere_filtering/case.py
writing
case.py
Periodic immersed boundary test case
runs/3d_1sphere_periodic/case.py
case.py
Off-center periodic IB test configuration
runs/3d_periodic_ibs_test/off-center/case.py
case.py
Centered periodic IB test configuration
runs/3d_periodic_ibs_test/centered/case.py
case.py
Drag force computation test case
runs/3d_drag_test/case.py
3 files
gen_voronoi_3D.py
3D Voronoi sphere distribution generator
voronoi/gen_voronoi_3D.py
distributions
gen_voronoi_2D.py
2D Voronoi circle distribution generator
voronoi/gen_voronoi_2D.py
distributions
sphere_array_locations.txt
Sphere location data for phi=0.1 test case
runs/phi01/sphere_array_locations.txt
flow test case
distribution
Summary by CodeRabbit
Release Notes
New Features
Enhancements
New Examples & Test Cases
CodeAnt-AI Description
Volume-filtered Euler–Lagrange: explicit unclosed-term computation, periodic IBM, and particle drag output
What Changed
Impact
✅ Clearer particle drag outputs
✅ Volume-filtered momentum terms available for analysis
✅ Support for periodic immersed boundaries and periodic forcing